LCM: An Efficient Algorithm for Enumerating Frequent Closed Item Sets
نویسندگان
چکیده
In this paper, we propose three algorithms LCMfreq, LCM, and LCMmax for mining all frequent sets, frequent closed item sets, and maximal frequent sets, respectively, from transaction databases. The main theoretical contribution is that we construct treeshaped transversal routes composed of only frequent closed item sets, which is induced by a parent-child relationship defined on frequent closed item sets. By traversing the route in a depth-first manner, LCM finds all frequent closed item sets in polynomial time per item set, without storing previously obtained closed item sets in memory. Moreover, we introduce several algorithmic techniques using the sparse and dense structures of input data. Algorithms for enumerating all frequent item sets and maximal frequent item sets are obtained from LCM as its variants. By computational experiments on real world and synthetic databases to compare their performance to the previous algorithms, we found that our algorithms are fast on large real world datasets with natural distributions such as KDD-cup2000 datasets, and many other synthetic databases.
منابع مشابه
An Efficient Algorithm for Enumerating Closed Patterns in Transaction Databases
The class of closed patterns is a well known condensed representations of frequent patterns, and have recently attracted considerable interest. In this paper, we propose an efficient algorithm LCM (Linear time Closed pattern Miner) for mining frequent closed patterns from large transaction databases. The main theoretical contribution is our proposed prefix-preserving closure extension of closed...
متن کاملGenerating Frequent Closed Item Sets Based on Zero-suppressed BDDs
(Abstract) Frequent item set mining is one of the fundamental techniques for knowledge discovery and data mining. In the last decade, a number of efficient algorithms for frequent item set mining have been presented, but most of them focused on just enumerating the item set patterns which satisfy the given conditions, and it was a different matter how to store and index the result of patterns f...
متن کاملLCM over ZBDDs: Fast Generation of Very Large-Scale Frequent Itemsets Using a Compact Graph-Based Representation
(Abstract) Frequent itemset mining is one of the fundamental techniques for data mining and knowledge discovery. In the last decade, a number of efficient algorithms for frequent itemset mining have been presented, but most of them focused on just enumerating the itemsets which satisfy the given conditions, and it was a different matter how to store and index the mining result for efficient dat...
متن کاملAn efficient hash based algorithm for mining closed frequent item sets
Association rule discovery has emerged as an important problem in knowledge discovery and data mining. The association mining task consists of identifying the frequent item sets, and then forming conditional implication rules among them. Efficient algorithms to discover frequent patterns are crucial in data mining research. Finding frequent item sets is computationally the most expensive step i...
متن کاملIndexed Enhancement on GenMax Algorithm for Fast and Less Memory Utilized Pruning of MFI and CFI
The essential problem in many data mining applications is mining frequent item sets such as the discovery of association rules, patterns, and many other important discovery tasks. Fast and less memory utilization for solving the problems of frequent item sets are highly required in transactional databases. Methods for mining frequent item sets have been implemented using a prefix-tree structure...
متن کامل